ToBI accent type recognition
Abstract
This paper describes work in progress on recognizing a subset of ToBI intonation labels (H*, L+H*, L*, !H*, L+!H*, and no accent). First, duration characteristics are used to classify syllables as accented or unaccented. The accented syllables are then subclassified based on fundamental frequency (F0) values. Potential F0 intonation gestures are schematized by connected line segments within a window around a given syllable. The schematizations are found using spline-basis linear regression. The regression weights on F0 points are varied in order to discount segmental effects and F0 detection errors. Parameters based on the line segments are then used to perform the subclassification. This paper presents new results in recognizing L*, L+H*, and L+!H* accents. In addition, the models presented here perform comparably (80% overall and 74% accent type recognition) to models that do not distinguish bitonal accents.
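The abstract gives no implementation details for the regression step, so the following is only a minimal sketch of one way a weighted spline-basis linear regression could fit connected line segments to F0 points. The knot position, the weights, and the function names (`linear_spline_basis`, `fit_f0_segments`, `segment_slopes`) are illustrative assumptions, not the authors' method.

```python
import numpy as np

def linear_spline_basis(t, knots):
    """Design matrix for a continuous piecewise-linear function.

    Columns: intercept, base slope, and one hinge term max(0, t - k) per knot.
    (Hypothetical helper; the knots are assumed to be given.)
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    cols += [np.maximum(0.0, t - k) for k in knots]
    return np.column_stack(cols)

def fit_f0_segments(t, f0, weights, knots):
    """Weighted least-squares fit of connected line segments to F0 points.

    A low weight discounts a point, e.g. one suspected of being an F0
    detection error or of reflecting a segmental effect.
    """
    X = linear_spline_basis(t, knots)
    w = np.sqrt(np.asarray(weights, dtype=float))
    y = np.asarray(f0, dtype=float)
    coef, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return coef

def segment_slopes(coef):
    """Slope of each segment: base slope plus the cumulative hinge coefficients."""
    return np.cumsum(coef[1:])

# Synthetic rise-fall contour (assumed data): rises until t = 0.5, then falls.
t = np.linspace(0.0, 1.0, 21)
f0 = 100.0 + 80.0 * t - 140.0 * np.maximum(0.0, t - 0.5)
f0[5] *= 2.0                # simulate a pitch-doubling error at one frame
weights = np.ones_like(t)
weights[5] = 0.0            # discount the suspect frame entirely
slopes = segment_slopes(fit_f0_segments(t, f0, weights, knots=[0.5]))
```

Parameters such as the slopes of the fitted segments could then serve as features for a classifier; which parameters the paper actually uses is not stated in the abstract.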
Similar resources
On the automatic ToBI accent type identification from data
This contribution faces the ToBI accent recognition problem with the goal of multiclass identification vs. the more conservative Accent vs. No Accent approach. A neural network and a decision tree are used for automatic recognition of the ToBI accents in the Boston Radio Corpus. Multiclass classification results show the difficulty of the problem and the impact of imbalanced classes. A study of...
Speech Technology, ToBI, and Making Sense of Prosody
The current paper critically examines why prosodic knowledge has not yet found its way into commercial applications of speech technology. As a key issue of potential improvements to speech recognition and synthesis we identify the capability of understanding and expressing meaning by means of prosodic features of speech. We suggest that even a complete and ‘correct’ ToBI transcription will alwa...
A corpus-based analysis of transfer effects and connected speech processes in Vietnamese English
This paper presents a corpus-based descriptive analysis of the most prevalent transfer effects and connected speech processes observed in a comparison of 11 Vietnamese English speakers (6 females, 5 males) and 12 Australian English speakers (6 males, 6 females) over 24 grammatical paraphrase items. The phonetic processes are segmentally labelled in terms of IPA diacritic features using the EMU ...
An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...
A comparison of inter-transcriber reliability for two systems of prosodic annotation: RaP (Rhythm and Pitch) and ToBI (Tones and Break Indices)
Agreement was investigated among five labelers for the use of two prosodic annotation systems: the ToBI (Tones and Break Indices) system [1,2] and the RaP (Rhythm and Pitch) system [3]. Each system permits the labeling of pitch accents and two levels of phrasal boundaries; RaP also permits labeling of speech rhythm and distinguishes multiple levels of prominence on syllables. After training wit...